University of Sheffield: description of the LaSIE system as used for MUC-6

Authors

  • Robert J. Gaizauskas
  • Kevin Humphreys
  • Hamish Cunningham
  • Yorick Wilks
Abstract

The LaSIE (Large Scale Information Extraction) system has been developed at the University of Sheffield as part of an ongoing research effort into information extraction and, more generally, natural language engineering. LaSIE is a single, integrated system that builds up a unified model of a text, which is then used to produce outputs for all four of the MUC-6 tasks. This model may also be used for purposes other than MUC-6 results generation; for example, we currently generate natural language summaries of the MUC-6 scenario results.

Put most broadly, and superficially, our approach involves compositionally constructing semantic representations of individual sentences in a text according to semantic rules attached to phrase structure constituents, which have been obtained by syntactic parsing using a corpus-derived context-free grammar. The semantic representations of successive sentences are then integrated into a 'discourse model' which, once the entire text has been processed, may be viewed as a specialisation of a general world model with which the system sets out to process each text.

LaSIE has a historical connection with the University of Sussex MUC-5 system [GCE93], from which it derives its approach to world modelling and coreference resolution and its approach to recombining fragmented semantic representations which result from partial grammatical coverage. However, the parser and grammar differ significantly from those used in the Sussex system. In its approach to named entity identification, LaSIE borrows to some extent from the approach adopted in the MUC-5 Diderot system [CGJ+93]. Virtually all of the code in LaSIE is new and has been developed since January 1995 with about 20 person-months of effort.
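The pipeline outlined in the abstract (semantic rules attached to phrase structure constituents, then integration of each sentence's semantics into a discourse model) can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the `Node` and `DiscourseModel` classes, the four-rule grammar, the predicate names `name`, `lsubj` and `lobj`, and the same-name merging that stands in for coreference resolution. None of it is taken from LaSIE's actual code, grammar, or world model.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import Optional


@dataclass
class Node:
    """A phrase-structure constituent (hypothetical, for illustration only)."""
    cat: str
    children: list = field(default_factory=list)
    word: Optional[str] = None


_ids = count(1)  # source of fresh identifiers e1, e2, ...


def interpret(node):
    """Return (referent, facts) for a constituent by composing its children."""
    if node.cat == "NP":  # toy rule: an NP introduces a named entity
        ref = f"e{next(_ids)}"
        return ref, [("name", ref, node.word)]
    if node.cat == "V":  # toy rule: a verb introduces an event
        ref = f"e{next(_ids)}"
        return ref, [(node.word, ref)]
    if node.cat == "VP":  # toy rule: VP -> V NP
        ev, v_facts = interpret(node.children[0])
        obj, np_facts = interpret(node.children[1])
        return ev, v_facts + np_facts + [("lobj", ev, obj)]
    if node.cat == "S":  # toy rule: S -> NP VP
        subj, np_facts = interpret(node.children[0])
        ev, vp_facts = interpret(node.children[1])
        return ev, np_facts + vp_facts + [("lsubj", ev, subj)]
    raise ValueError(f"no semantic rule for category {node.cat!r}")


class DiscourseModel:
    """Accumulates facts across sentences, merging entities that share a name."""

    def __init__(self):
        self.facts = set()
        self.by_name = {}  # name -> identifier of the first entity seen with it

    def integrate(self, facts):
        # Assumed stand-in for coreference: same name means same entity.
        rename = {}
        for fact in facts:
            if fact[0] == "name":
                _, ref, name = fact
                rename[ref] = self.by_name.setdefault(name, ref)
        for fact in facts:
            args = tuple(rename.get(arg, arg) for arg in fact[1:])
            self.facts.add((fact[0],) + args)


# Hand-built parses of "IBM hired Smith." and "Smith joined IBM."
s1 = Node("S", [Node("NP", word="IBM"),
                Node("VP", [Node("V", word="hire"), Node("NP", word="Smith")])])
s2 = Node("S", [Node("NP", word="Smith"),
                Node("VP", [Node("V", word="join"), Node("NP", word="IBM")])])

model = DiscourseModel()
for sentence in (s1, s2):
    _, sentence_facts = interpret(sentence)
    model.integrate(sentence_facts)
```

After both sentences are integrated, the second mentions of "IBM" and "Smith" resolve to the identifiers introduced in the first sentence, so `model.facts` holds a single connected description of the text rather than two unrelated sentence representations.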


Similar resources

University of Sheffield: Description of the LaSIE-II System as Used for MUC-7

The University of Sheffield NLP group took part in MUC-7 using the LaSIE-II system, an evolution of the LaSIE (Large Scale Information Extraction) system first created for participation in MUC-6 [9] and part of a larger research effort into information extraction underway in our group. LaSIE-II was used to carry out all five of the MUC-7 tasks and was, in fact, the only system to take part in all of t...


American University in Cairo: Description of the American University in Cairo's System Used for MUC-7

Portions of the American University in Cairo's MUC-7 system, MUC7-Plink, have participated in every Message Understanding Competition since MUC-4. The Plink parser was developed at the University of Michigan where it formed the core of the systems entered in MUC-4 [2] and MUC-5 [1]. Recently, the Plink parser was added to GATE [6] to facilitate interaction between language processing modules. M...


Quantitative Evaluation of Coreference Algorithms in an Information Extraction

Algorithms for performing coreference resolution can only be precisely evaluated given a benchmark corpus of coreference-annotated texts, together with techniques for evaluating the algorithms' output against the corpus. Such a corpus and such techniques have become available for the first time as part of the Message Understanding Conference 6 (MUC-6) evaluations of information extraction systems...


University of Durham: Description of the LOLITA system as Used in MUC-7

LOLITA has been designed in such a way that the code implementing the MUC tasks is only a small part of the whole system. A core system provides complex facilities with the MUC system being built so that it utilises these facilities. Hence, after some background to the LOLITA project, the ‘core’ of LOLITA is described. This system description is substantially similar to that given for MUC-6 [1]...



Publication date: 1995